Brain imaging studies indicate that speech motor areas are recruited for auditory speech 
perception, especially when intelligibility is low due to environmental noise or when 
speech is accented. The purpose of the present study was to determine the relative 
contribution of brain regions to the processing of speech containing phonetic categories 
from one's own language, speech with accented samples of one's native phonetic 
categories, and speech with unfamiliar phonetic categories. To that end, native English and 
Japanese speakers identified the speech sounds /r/ and /l/ that were produced by native 
English speakers (unaccented) and Japanese speakers (foreign-accented) while functional 
magnetic resonance imaging measured their brain activity. For native English speakers, the 
Japanese accented speech was more difficult to categorize than the unaccented English 
speech. In contrast, Japanese speakers have difficulty distinguishing between /r/ and /l/, so 
both the Japanese accented and English unaccented speech were difficult to categorize. 
Brain regions involved with listening to foreign-accented productions of a first language 
included primarily the right cerebellum, left ventral inferior premotor cortex (PMvi), and 
Broca's area. Brain regions most involved with listening to a second-language phonetic 
contrast (foreign-accented and unaccented productions) also included the left PMvi and 
the right cerebellum. Additionally, increased activity was observed in the right PMvi, the 
left and right ventral superior premotor cortex (PMvs), and the left cerebellum. These results 
support a role for speech motor regions during the perception of foreign-accented native 
speech and for perception of difficult second-language phonetic contrasts. 
INTRODUCTION 

A growing body of research suggests that speech motor areas are 
recruited to facilitate auditory speech perception when the acous- 
tic signal is degraded or masked by noise (Callan et al., 2010; 
Schwartz et al., 2012; Adank et al., 2013; Moulin-Frier and Arbib, 
2013). Researchers hypothesize that auditory speech signals are 
translated into internally simulated articulatory control signals 
(articulatory-auditory internal models), and that these internal 
simulations help to constrain speech perception (Callan et al., 
2004a; Wilson and Iacoboni, 2006; Skipper et al., 2007; Iacoboni, 
2008; Poeppel et al., 2008; Rauschecker, 2011; Schwartz et al., 
2012). Indeed, brain imaging studies have demonstrated that 
activity increases in speech motor areas when participants listen 
to speech in noise relative to when they listen in noise-free con- 
ditions (Callan et al., 2003a, 2004b). Increased activity in speech 
motor areas has also been observed when listeners identify pho- 
netic categories that are not in their first language (non-native), 
relative to the activity observed when they identify phonetic categories 
from their first language (native) (Callan et al., 2003b, 
2004a, 2006a; Wang et al., 2003). Moreover, activity in speech 
motor areas has been found to increase when participants lis- 
ten to sentences in their first language when they are spoken in 
an unfamiliar accent (Adank et al., 2013). These observations, as 
well as observations from other studies that have demonstrated 
that speech motor brain regions are responsive to both produc- 
tion and perception of speech, support motor simulation theories 
of speech perception (Callan et al., 2000, 2006b, 2010; Wilson 
et al., 2004; Nishitani et al., 2005; Meister et al., 2007). In this 
study, we investigated the neural processes involved in the per- 
ception of phonetic categories from one's first language produced 
by native speakers, as well as those produced by speakers with a 
foreign-language accent. We compared the neural activity in these 
conditions to the activity observed when participants perceived 
phonetic categories from their second language (again, both pro- 
duced by a native speaker of that second language, and produced 
by a speaker with a foreign-language accent). 

Adults often have considerable difficulty discriminating and 
identifying many non-native phonetic categories in their second 
language that overlap with a single phonetic category in their first 
(native) language, even after years of exposure to that second lan- 
guage (Miyawaki et al., 1975; Trehub, 1976; Strange and Jenkins, 
1978; Werker et al., 1981; Werker and Tees, 1999). The English 
/r/ and /l/ phonetic contrast is an example of a difficult non-native 
phonetic contrast for native Japanese speakers (Miyawaki 
et al., 1975). Intensive phonetic identification training can result 
in long-term improvement in speech perception that generalizes 
to novel stimuli (Lively et al., 1994; Akahane-Yamada, 1996; 
Bradlow et al., 1999). Perceptual identification training can also 
lead to improvements in production (Bradlow et al., 1997), even 
in the absence of formal production training. The observation 
that perceptual improvements lead to production improvements 
suggests that a perceptual-motor component may be respon- 
sible for the improved phonetic identification. Indeed, several 
brain-imaging studies support the hypothesis that neural pro- 
cesses associated with speech production constrain and facilitate 
phoneme identification (Callan et al., 2004a, 2010; Skipper et al., 
2007). 

Similar to the difficulties listeners have discriminating and 
identifying non-native phonetic contrasts in a second language, 
foreign-accented native speech is often difficult for a native 
speaker of the language to perceive (Goslin et al., 2012; Adank 
et al., 2013; Moulin-Frier and Arbib, 2013). Recent evidence 
suggests that speech motor processes are recruited to facilitate 
perception when listening to foreign-accented productions of a 
language (Adank et al., 2013; Moulin-Frier and Arbib, 2013). For 
example, Adank et al. (2013) found evidence for sensorimotor 
integration during processing of foreign-accented speech when 
they asked one group of participants to imitate the unfamiliar 
foreign-accent of a speaker who uttered sentences in the partici- 
pants' first language, and compared their brain activity to another 
group of participants who repeated the same sentences in their 
own native accent. Adank et al. (2013) compared the levels of 
activation in the speech motor regions of the brain (including 
the inferior frontal gyrus, and Broca's area) when participants 
listened to sentences before a production task, to the levels of 
activation observed when participants listened to sentences after 
a production task. Larger differences in speech motor activity 
were observed for the participants who imitated the unfamiliar, 
foreign-accented speech, compared to the participants who 
repeated the sentences in their own accent, specifically when the 
participants listened to the sentences before compared to after the 
production task. 

The goal of the present study was to differentiate the neu- 
ral processes that are involved in the perception of phonetic 
categories in a second language (non-native) from the neural 
processes involved in the perception of foreign-accented produc- 
tions of phonetic categories from one's first language. In this 
study, native English (Eng) and Japanese (Jpn) speakers listened 
to native English ("unaccented") and Japanese ("accented") productions 
of English syllables that began with either /r/ or /l/. 
The Japanese productions of the English syllables (accented) used 
for the study were found to have a confusion rate (misidentified 
as the wrong syllable) of 29% when presented to native English 
speakers. The Japanese-accented productions could be perceived 
as either /r/ or /l/ by native English speakers on a proportion 
of the trials. The native English speakers were more accurate 
at identifying the unaccented English speech stimuli than the 
Japanese-accented speech stimuli. In contrast, the native Japanese 
speakers had difficulty identifying both the English-unaccented 
speech stimuli and the Japanese-accented stimuli. The following 
contrasts were investigated: (1) The neural processes that are 
involved in the perception of foreign-accented productions of a 
first language phonetic category were investigated using the con- 
trast Eng(accented - unaccented) - Jpn(accented - unaccented). 
Subtracting the activity observed in the Jpn group controlled for 
general stimulus variables. (2) The contrast of Eng(accented) - 
Eng(unaccented) investigated which areas were involved in pro- 
cessing a difficult native phonetic identification task (accented) 
compared to those involved in processing an easy phonetic iden- 
tification task (unaccented), without the potential confound of 
extraneous between-group differences. However, acoustic stimulus 
characteristics were not controlled for by this contrast. (3) The 
neural processes selective for the perception of foreign-accented 
productions of a second language phonetic category, compared 
to foreign-accented productions of a first language phonetic 
category, were investigated using the contrast Jpn(accented) - 
Eng(accented). This contrast controlled for the neural processes 
that were related to task difficulty, such as attention and verbal 
rehearsal. (4) To investigate the overall neural processes involved 
in the perception of (native) unaccented productions of a second 
language phonetic category relative to the perception of unac- 
cented productions of a first language phonetic category, we used 
the contrast Jpn(unaccented) - Eng(unaccented). This contrast 
did not control for task difficulty. All three of the contrasts above 
controlled for general processes related to performing a categori- 
cal perceptual identification task using a button response, though 
only the Jpn(accented) - Eng(accented) contrast additionally 
controlled for task difficulty. 
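
One way to read the four contrasts above is as weight vectors over the four group-by-accent cell means. The sketch below is illustrative only: the cell ordering, the variable names, and the example activation values are assumptions introduced here, not values from the study, and it is not a description of how the contrasts were computed in SPM.

```python
# Cell order (an assumption made for this sketch): group x accent.
CELLS = ["Eng_accented", "Eng_unaccented", "Jpn_accented", "Jpn_unaccented"]

# One weight per cell, in the order of CELLS.
CONTRASTS = {
    "(1) Eng(accented - unaccented) - Jpn(accented - unaccented)": [1, -1, -1, 1],
    "(2) Eng(accented) - Eng(unaccented)": [1, -1, 0, 0],
    "(3) Jpn(accented) - Eng(accented)": [-1, 0, 1, 0],
    "(4) Jpn(unaccented) - Eng(unaccented)": [0, -1, 0, 1],
}


def contrast_value(weights, cell_means):
    """Weighted sum of the cell means for one contrast."""
    return sum(w * cell_means[c] for w, c in zip(weights, CELLS))


# Hypothetical activation estimates for one voxel (arbitrary units).
example_means = {
    "Eng_accented": 2.0,
    "Eng_unaccented": 1.0,
    "Jpn_accented": 2.5,
    "Jpn_unaccented": 2.4,
}

for name, weights in CONTRASTS.items():
    print(name, "=", round(contrast_value(weights, example_means), 2))
```

Each weight vector sums to zero, so activity common to all four cells (for example, activity related to the button response) cancels out. Contrast (1) is the group-by-accent interaction: it equals contrast (2) minus the corresponding accent difference in the Jpn group.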

A number of brain regions have been shown to be involved 
with the perception of unaccented/native productions of a second 
language phonetic category (Callan et al., 2003a, 2004a, 2006a; 
Wang et al., 2003) as well as foreign-accented speech (Adank et al., 
2013). These regions include, but are not limited to: the ven- 
tral inferior premotor cortex including Broca's area (PMvi), the 
ventral superior and dorsal premotor cortex (PMvs/PMd), the 
superior temporal gyrus/sulcus (STG/S), and the cerebellum. If 
the neural processes involved in processing difficult-to-perceive 
speech sounds are dependent on the relative contribution of 
regions involved in articulatory planning control, then one might 
predict that the brain regions involved with speech motor control 
(PMvi/Broca's, PMvs/PMd, and the cerebellum) would be more 
active than regions involved with auditory processing (STG/S) 
when general acoustic differences in the stimuli are controlled. 

As previously mentioned, the brain regions involved with 
internally simulating speech production (internal models) are 
hypothesized to constrain and facilitate speech perception, espe- 
cially under degraded conditions (e.g., speech in noise, non- 
native speech) (Callan et al., 2003b, 2004a; Iacoboni and Wilson, 
2006; Wilson and Iacoboni, 2006; Skipper et al., 2007; Iacoboni, 
2008; Rauschecker and Scott, 2009; Rauschecker, 2011; Callan 
et al., 2014). Internal models are thought to simulate the 
input/output characteristics, or their inverses, of the motor 
control system (Kawato, 1999). With regard to speech production, 
inverse internal models predict the motor commands 
necessary to articulate a desired auditory (and/or orosensory) 
target (auditory-to-articulatory mapping). Forward internal 
models, conversely, predict the auditory (and/or orosensory) 
consequences of simulated speech articulation (articulatory-to- 
auditory mapping). It has been proposed that both forward and 
inverse internal models constrain and facilitate speech perception, 
especially under degraded conditions (Callan et al., 2004a, 2014; 
Rauschecker and Scott, 2009; Rauschecker, 2011). Facilitation is 
achieved by a process akin to analysis-by-synthesis (Stevens, 2002; 
Poeppel et al., 2008) (forward internal models: articulatory-to- 
auditory prediction) and synthesis-by-analysis (inverse internal 
models: auditory-to-articulatory prediction), specifically by com- 
petitive selection of the speech unit (phoneme, syllable, etc.) 
that best matches the ongoing auditory signal (or visual signal, 
in the case of audiovisual or visual-only speech). Brain regions 
thought to be involved with instantiating these articulatory-to- 
auditory and auditory-to-articulatory internal models include 
speech motor areas such as the PMC and Broca's area, the pos- 
terior regions of the STG/S, the IPL, and the cerebellum. In 
particular, the cerebellum has been shown to instantiate internal 
models for motor control (Kawato, 1999; Imamizu et al., 2000), 
and there is evidence that it instantiates internal models related to 
speech (Callan et al., 2004a, 2007; Rauschecker, 2011; Tourville 
and Guenther, 2011; Callan and Manto, 2013). Brain activity 
in these regions (including the PMC, Broca's area, the IPL, and 
the cerebellum) during speech perception tasks has been used as 
evidence to support the involvement of motor processes during 
speech perception. 

One potential criticism of ascribing activity found in speech 
motor regions to speech perception is that many of these same 
regions are known to be more active as a function of task dif- 
ficulty. Activity in brain regions such as the IFG, the PMC, and 
the cerebellum has been shown to increase with task-related 
attentional demands and working memory (including verbal 
rehearsal) (Jonides et al., 1998; Davachi et al., 2001; Sato et al., 
2009; Alho et al., 2012). As has been previously suggested (Hickok 
and Poeppel, 2007; Poeppel et al., 2008; Lotto et al., 2009; 
Scott et al., 2009), activity in these speech motor regions may 
not be related to speech perception intelligibility, but rather to 
other processes related to task difficulty. If these brain regions 
involved with speech motor processing are increasingly more 
active as a function of task difficulty, one would predict that 
subjects with worse phonetic identification performance (greater 
task difficulty) would show increased activity in these regions 
compared to subjects with better phonetic identification perfor- 
mance. However, the opposite result has been found, with an 
increase in PMC, IFG, and cerebellum activity associated with 
better phonetic identification performance on a difficult non-native 
phonetic category (Callan et al., 2004a). Similarly, PMC 
activity has been shown to be greater for correct compared 
to incorrect trials during a phonetic identification in noise task 
(Callan et al., 2010). 

It is hypothesized that the perception of foreign-accented first 
language phonetic categories depends on the brain regions that 
instantiate the auditory-articulatory representation of phonetic 
categories. Research suggests that these regions include left hemisphere 
Broca's area and the PMC. In the case of the perception 
of second-language phonetic categories for which the distinct 
second-language phonemes are subsumed within a single phonetic 
category in the native language (e.g., English /r/ and /l/ for 
native Japanese speakers), additional neural processes may be 
recruited to establish new phonetic categories without interfering 
with the established native phonetic category. It is hypothesized 
that the establishment of these second-language phonetic cate- 
gories (when the second-language is acquired after childhood) 
involves greater reliance on general articulatory-to-auditory feed- 
back control systems, which generate auditory predictions based 
on articulatory planning, and are thought to be instantiated in 
right hemisphere PMC (Tourville and Guenther, 2011; Guenther 
and Vladusich, 2012). 

METHODS 
SUBJECTS 

Thirteen right-handed native Japanese (Jpn) speakers with some 
English experience (at least 6 years of classes in junior and 
senior high school) and thirteen right-handed native English 
(Eng) speakers participated in this study. The native Japanese- 
speaking subjects were nine females and four males whose ages 
ranged from 23 to 37 years (M = 30.4 years, SD = 4.5). The 
native English-speaking subjects were one female and twelve 
males whose ages ranged from 21 to 39 years (M = 27.8 years, 
SD = 5.1). All subjects included in this study scored significantly 
above chance when they identified the /r/ and /l/ productions 
of a native English speaker, which ensured that all subjects were 
actively trying to do the task. Subjects were paid for their partici- 
pation, and gave written informed consent for the experimental 
procedures, which were approved by the ATR Human Subject 
Review Committee in accordance with the principles expressed 
in the Declaration of Helsinki. 

STIMULI AND PROCEDURE 

The stimuli were acquired from the speech database compiled 
by the Department of Multilingual Learning (ATR-HIS, Kyoto, 
Japan). The experiment had two within-subject conditions: a 
foreign-accented speech condition and an unaccented speech 
condition. These two conditions were composed of audio speech 
stimuli consisting of English syllables beginning with /r/ or /l/, 
which were followed by five different English vowel contexts 
(/a, e, i, o, u/). There were three occurrences of each syllable 
for each accent condition for a total of 60 trials in the experiment. 
All stimuli were recorded digitally in an anechoic chamber with 
a sampling rate of 44,100 Hz. The unaccented speech was taken 
from samples of female and male native English speakers. The 
foreign-accented speech was taken from samples of female and 
male native Japanese speakers that produced /r/-/l/ confusions 
(M = 29%, SD = 13%), as determined by a forced-choice iden- 
tification task performed by native English speakers (the number 
of evaluators ranged from 6 to 10 individuals, depending on the 
stimulus). Both the foreign-accented and unaccented /r/ and /l/ 
stimuli consisted of six female voices and nine male voices. The 
stimuli were down-sampled to 22,050 Hz for presentation during 
the experiment. 
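
The trial count stated above follows from the design: 2 initial consonants x 5 vowel contexts x 3 occurrences x 2 accent conditions = 60. A minimal sketch that enumerates the design is given below; the syllable spellings (e.g., "ra", "la") and the tuple layout are assumptions made for illustration, not the labels used in the stimulus database.

```python
from itertools import product

consonants = ["r", "l"]             # initial consonant of each syllable
vowels = ["a", "e", "i", "o", "u"]  # five vowel contexts
accents = ["unaccented", "accented"]
occurrences = 3                     # occurrences of each syllable per accent condition

# One tuple per trial: (accent condition, syllable, occurrence index).
trials = [
    (accent, consonant + vowel, k)
    for accent, consonant, vowel, k in product(
        accents, consonants, vowels, range(occurrences)
    )
]

print(len(trials))  # total number of stimulus trials in the design
```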
The fMRI procedure consisted of an event-related design 
in which the sequence of presentation of the various stimulus 
conditions (unaccented /r/, unaccented /l/, foreign-accented /r/, 
foreign-accented /l/, and null trial) was generated stochastically 
using SPM99 (Wellcome Department of Cognitive Neurology, 
UCL). An event-related design was employed so that the var- 
ious stimulus conditions could be presented (approximately 
85-90 dB SPL) in a pseudo-random order. This ensured that 
subjects could not predict which stimulus would occur during 
the subsequent presentation. Stimuli were presented (synchronized 
with fMRI scanning using Neurobehavioral Systems' 
Presentation software) via MR-compatible headphones (Hitachi 
Advanced Systems' ceramic transducer headphones; frequency 
range 30-40,000 Hz, approximately 20 dB SPL passive attenuation). 
Subjects identified whether the stimuli started with /r/ 
or /l/, and indicated which they perceived by pressing a button 
with their left thumb. The left hand was used instead of 
the right hand so that brain activity in left Broca's area and 
left PMC could be better identified, with less influence of activ- 
ity associated with the button-press motor response. The iden- 
tity of the buttons was counterbalanced across subjects. Stimuli 
were presented approximately every 2250 ms in a pseudo-random 
order dependent on the event sequence. Subjects were 
asked to respond quickly to minimize differences in the hemo- 
dynamic response resulting from long response times (Poldrack, 
2000). However, they were not asked to respond as quickly 
as they could; therefore, response latencies were not evaluated. 
Null trials in which only silence occurred were also included 
and used as a baseline condition. Subjects were not given 
online feedback regarding the correctness of their responses. 
All subjects were given a practice session outside of the scan- 
ner using stimuli similar to those used in the experimental 
session. 

Each subject participated in multiple experiments, including 
the present study, within the same insertion into the fMRI scan- 
ner. The order of the different experiments was counterbalanced 
across subjects. Depending on the number of experiments in 
which a subject participated, the total time in the scanner ranged 
from approximately 30-60 min. The session lasted approximately 
7 min for this experiment. 

fMRI DATA COLLECTION AND PREPROCESSING 

For functional brain imaging, Shimadzu-Marconi's Magnex 
Eclipse 1.5T PD250 was used at the ATR Brain Activity Imaging 
Center. Functional T2* weighted images were acquired using a 
gradient echo-planar imaging sequence (echo time 55 ms; repeti- 
tion time 2000 ms; flip angle 90°). A total of 20 contiguous axial 
slices were acquired with a 3 x 3 x 6 mm voxel resolution covering 
the cortex and cerebellum. For some subjects, 20 slices were 
not a sufficient number to cover the entire cortex and thus the top 
part of the cortex was missing. As a result, the analyses conducted 
in this study do not include the top part of the cortex. A total of 
304 scans were taken during a single session. Images were pre- 
processed using programs within SPM8 (Wellcome Department 
of Cognitive Neurology, UCL). Differences in acquisition time 
between slices were accounted for; images were realigned and spatially 
normalized to a standard space using a template EPI image 
(3 x 3 x 3 mm voxels), and were smoothed using a 6 x 6 x 
12 mm FWHM Gaussian kernel. 
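
For readers more used to thinking of a Gaussian kernel in terms of its standard deviation, the FWHM values above can be converted with the standard relation sigma = FWHM / (2 * sqrt(2 * ln 2)). The sketch below applies that relation to the kernel and normalized voxel size reported here; the function and variable names are assumptions for illustration.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


kernel_fwhm_mm = (6.0, 6.0, 12.0)  # smoothing kernel reported in the text
voxel_size_mm = (3.0, 3.0, 3.0)    # voxel size after spatial normalization

sigma_mm = tuple(fwhm_to_sigma(f) for f in kernel_fwhm_mm)
sigma_vox = tuple(s / v for s, v in zip(sigma_mm, voxel_size_mm))

print([round(s, 2) for s in sigma_mm])   # standard deviations in mm
print([round(s, 2) for s in sigma_vox])  # standard deviations in voxels
```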

STATISTICAL IMAGE ANALYSIS 

Regional brain activity for the various conditions was assessed 
with a general linear model using an event-related design. 
Realignment parameters were used to regress out movement- 
related artifacts. In addition, low-pass filtering, which used the 
hemodynamic response function, was employed. The event- 
related stochastic design used to model the data included null 
responses and a stationary trial occurrence probability. A mixed- 
effects model was employed. A fixed-effect analysis was first 
employed for all contrasts of interest across data from each subject 
separately. The contrasts of interest for both the Jpn and Eng sub- 
jects included: unaccented speech relative to baseline; accented 
speech relative to baseline; and accented relative to unaccented 
speech. At the random effects level between subjects, the con- 
trast image of the parameter estimates of the first level analysis for 
each subject was used as input for an SPM model employing two-sample 
t-tests. The contrasts of interest consisted of the following: 
(1) processes related to the perception of first language phonetic 
contrasts in accented speech, Eng(accented - unaccented) - 
Jpn(accented - unaccented); (2) processes related to the perception 
of first language accented speech (difficult task) relative to 
first language unaccented speech (easy task), Eng(accented) - 
Eng(unaccented); (3) processes related to the perception of 
foreign-accented speech, Jpn(accented) - Eng(accented); and 
(4) processes related to the perception of unaccented productions 
of a second language phonetic category, 
Jpn(unaccented) - Eng(unaccented). Because the study is 
quasi-experimental in the sense that assignment of participants into Eng 
and Jpn groups is not random, the variance not attributable to 
the independent experimental variables (e.g., educational expe- 
rience and cultural differences related to carrying out the tasks) 
may significantly influence participants' performance and neu- 
ral responses, which could potentially confound the results. To 
ensure that the differential brain activity related to the contrasts 
of interest (given above) was not the result of extraneous neural 
processes involved with behavioral performance, task difficulty, 
and/or variables arising from the quasi-experimental design, the 
random-effects analyses were conducted using the raw percent 
correct phonetic identification performance scores as a covariate 
of non-interest. 

A False Discovery Rate (FDR) correction for multiple compar- 
isons across the entire volume was employed with a threshold of 
pFDR < 0.05 using a spatial extent greater than 5 voxels. If no 
voxels were found to be significant using the FDR, a correction 
threshold of p < 0.001 uncorrected with a spatial extent threshold 
greater than 5 voxels was used. Region of interest (ROI) anal- 
yses were conducted using MNI coordinates for the PMvi/IFG 
(left -51,9,21; right 51,15,18), the PMvs (left -36,-3,57; right 
27,-3,51), the STG/S (left -57,-39,9) and the cerebellum 
(left -27,-63,-39; right 30,-66,-33) given that in Callan et al. 
(2004a) these regions were found to be involved in processing 
difficult-to-perceive speech contrasts. It should be noted that 
these coordinates (for PMvi/IFG and STG/S) fall within the clus- 
ter of activity in regions found to be active for perception of 
accented speech, as reported by Adank et al. (2013). Small volume 
correction for multiple comparisons was carried out using the 
seed voxels reported above within a sphere with a radius of 8 mm. 
The location of active voxels was determined by reference to the 
Talairach atlas (Talairach and Tournoux, 1988) as well as by using 
the Anatomy Toolbox within SPM8. Activity in the cerebellum 
was localized with reference to the atlas given by Schmahmann 
et al. (2000). 
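
The thresholding rule described above (FDR at 0.05, falling back to p < 0.001 uncorrected when nothing survives) can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg step-up procedure as the FDR method, which the text does not spell out; the function names are hypothetical, and the spatial extent criterion (> 5 voxels) is omitted because it requires voxel adjacency information.

```python
def fdr_threshold(p_values, q=0.05):
    """Largest p-value passing the Benjamini-Hochberg criterion, or None."""
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # Step-up rule: keep the largest p with p <= q * rank / m.
        if p <= q * rank / m:
            threshold = p
    return threshold


def significant_voxels(p_values, q=0.05, fallback_p=0.001):
    """Apply FDR first; if no voxel survives, use the uncorrected fallback."""
    threshold = fdr_threshold(p_values, q)
    if threshold is not None:
        return [p <= threshold for p in p_values], "FDR"
    return [p < fallback_p for p in p_values], "uncorrected"


# Hypothetical voxel-wise p-values, for illustration only.
mask, method = significant_voxels([0.001, 0.008, 0.039, 0.041, 0.9])
print(method, mask)
```

In an actual analysis the p-values would come from the voxel-wise statistics across the whole volume, and the cluster extent filter would be applied to the resulting mask.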

RESULTS 

BEHAVIORAL PERFORMANCE 

The results of the two-alternative forced-choice phoneme identification 
task (in percent correct) were analyzed across subjects 
using an ANOVA with the two factors of language group (Jpn 
and Eng) and accent (unaccented and accented). Bonferroni 
corrections for multiple comparisons were used to determine 
statistical significance at p < 0.05 for all behavioral analyses 
conducted. The results are as follows: the interaction between 
language group and accent was significant 
[Eng unaccented: M = 94.6%, SE = 0.6; Jpn unaccented: 
M = 69.5%, SE = 1.8; Eng accented: M = 65.0%, SE = 1.5; Jpn 
accented: M = 62.5%, SE = 2.8; F(1, 48) = 40.2, p < 0.05 
corrected] (see Figure 1). The main effect of group (Eng > Jpn) 
was significant [Eng: M = 79.8%, SE = 3.11; Jpn: M = 66.0%, 
SE = 1.8; F(1, 48) = 60.3, p < 0.05 corrected]. The main effect 
of accent (unaccented > accented) was also significant 
[unaccented: M = 82.1%, SE = 2.9; accented: M = 63.7%, SE = 1.1; 
F(1, 48) = 106.4, p < 0.05 corrected]. The identification 
performance on the two-alternative forced-choice task was significantly 
greater than chance for the unaccented and accented conditions 
for both Eng and Jpn subjects (see Figure 1) [Jpn unaccented: 
T(12) = 7.2, p < 0.05 corrected; Jpn accented: T(12) = 7.2, p < 
0.05 corrected; Eng unaccented: T(12) = 75.1, p < 0.05 corrected; 
Eng accented: T(12) = 10.7, p < 0.05 corrected]. The Eng subjects 
had significantly better performance than the Jpn subjects 
for the unaccented speech stimuli condition [T(12) = 9.4; p < 
0.05 corrected]. For accented stimuli, there was no significant 
difference for identification (evaluated based on the intended 
production of the stimuli) between native English-speaking subjects 
and native Japanese-speaking subjects [T(24) = 1.1; p = 0.27 
uncorrected]. There was also no significant difference between 
Eng subjects' performance for the accented stimuli and Jpn subjects' 
performance for the unaccented stimuli [T(24) = 1.13, p = 
0.15 uncorrected]. For Eng subjects there was a significant difference 
between performance for the unaccented and accented 
stimuli [T(12) = 18.2, p < 0.05 corrected]. The difference for Jpn 
subjects between the performance for unaccented and accented 
stimuli was not significant when corrections were made for multiple 
comparisons, but the difference was significant using an 
uncorrected threshold [T(12) = 3.3, p < 0.01 uncorrected]. 
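
As a reading aid, the degrees of freedom and marginal means reported in this section can be checked against the design (13 subjects per group, two groups, two accent conditions). The sketch below is a consistency check under the assumption that the ANOVA error term treats the four cells as 52 independent observations, which is what an error df of 48 implies; the variable and function names are introduced here for illustration.

```python
n_per_group = 13
n_groups = 2
n_accents = 2

# Degrees of freedom implied by the design.
n_obs = n_per_group * n_groups * n_accents  # 52 cell-level scores
df_error = n_obs - n_groups * n_accents     # error df for F(1, df_error)
df_one_sample = n_per_group - 1             # tests against chance and paired tests
df_two_sample = 2 * n_per_group - 2         # between-group comparisons

# Cell means (percent correct) as reported in the text.
cell_means = {
    ("Eng", "unaccented"): 94.6,
    ("Jpn", "unaccented"): 69.5,
    ("Eng", "accented"): 65.0,
    ("Jpn", "accented"): 62.5,
}


def marginal_mean(level):
    """Average of the cell means that share one group or accent level."""
    values = [m for key, m in cell_means.items() if level in key]
    return sum(values) / len(values)


print(df_error, df_one_sample, df_two_sample)
for level in ("Eng", "Jpn", "unaccented", "accented"):
    print(level, round(marginal_mean(level), 2))
```

The marginal means computed from the rounded cell means agree with the reported main-effect means to within rounding (0.1 percentage points).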

BRAIN IMAGING 

The random effects one-sample t-test of the unaccented and 
accented condition relative to the null condition (background 
scanner noise) was carried out separately for Jpn and Eng groups. 
A FDR correction for multiple comparisons across the entire vol- 
ume was used with a threshold of pFDR < 0.05 (spatial extent > 
5 voxels). The results for unaccented and accented conditions for 
both the Eng and Jpn groups (see Figures 2A-D) indicated exten- 
sive activity in regions of the brain known to be involved with 
speech processing bilaterally (STG/S, including primary auditory 
cortex, MTG, SMG, Broca's area, PMC, medial frontal cortex 
(MFC)/pre-supplementary motor area (pre-SMA), anterior cingulate 
cortex (ACC), cerebellar lobule VI, cerebellar Crus I). Activity 
associated with the motor response of pushing the button with the left 
thumb was also present for both the Jpn and Eng groups in the 
right motor and somatosensory cortex. The conjunction analysis, 
which determined the intersection of active voxels for all condi- 
tions thresholded at pFDR < 0.05, showed activity in most of the 
above-mentioned regions (see Figure 2G and Table 1). 

The interaction effect between the factors of language group 
and accent is discussed below. The main effect of accent (accented 
vs. unaccented) did not show any significant differential activity 
using a corrected threshold of pFDR < 0.05 or an uncorrected 
threshold of p < 0.001 (spatial extent > 5 voxels). The main 
effect of language group (Jpn vs. Eng, see Figure 2H and Table 2) 
showed significant differential activity for Japanese > English 
(red; p < 0.001, spatial extent > 5 voxels), predominantly in left 
and right PMvi/Broca's area, PMvs/PMd, the postcentral gyrus, 
the cerebellum, and the left inferior parietal lobule. The significant 
differential activity for Eng > Jpn (blue; p < 0.001, spatial 
extent > 5 voxels) was present predominantly in the medial 
frontal gyrus, the middle frontal gyrus, the anterior cingulate 
cortex, and the middle cingulate cortex. 

The contrast of accented relative to unaccented speech was car- 
ried out separately for Eng and Jpn subjects. For both Eng and 
Jpn subjects, no significant activity was found using a corrected 
threshold of pFDR < 0.05; therefore, a threshold of p < 0.001 
uncorrected was used. For Eng subjects, activity was found to be 
present in left PMvi/Broca's area, right PMvs/PMd, left Broca's 
area BA 45, left IFG BA 47, the pre-SMA, and left and right 
cerebellar lobules VI and VIIa (see Figure 2E and Table 3). The 
results of the region of interest (ROI) analysis using small volume 
correction for multiple comparisons revealed significant activity 
in the left and right cerebellum lobule VI, and a trend toward 
significant activity in the left PMvi/Broca's, the right PMvs/PMd, 
and the left STG/S (see Table 4). 

FIGURE 1 | Mean percent correct behavioral phonetic (/r/ vs. /l/) 
identification performance for the English (blue) and Japanese (red) 
groups for unaccented and foreign-accented speech. Standard error of 
the mean is given above each bar. All conditions were significantly above 
chance performance of 50%. See text for additional contrasts that were 
statistically significant. 

FIGURE 2 | Significant brain activity (thresholded at pFDR < 0.05 
corrected) for the contrast of (A) Eng (unaccented), (B) Jpn 
(unaccented), (C) Eng (accented), and (D) Jpn (accented). All 
contrasts showed activity bilaterally in premotor cortex and Broca's area, 
the superior temporal gyrus/sulcus, the inferior parietal lobule, the 
pre-supplementary motor area (pre-SMA), and the cerebellum. The 
conjunction analysis, shown in (G), confirmed these regions were active 
for all conditions. (E) The contrast of accented relative to unaccented 
thresholded at p < 0.001 uncorrected for Eng showed activity in the left 
inferior frontal gyrus in Broca's area 44, the right dorsal premotor 
cortex, the pre-SMA, and the cerebellum bilaterally. (F) The contrast of 
accented relative to unaccented for the Jpn group did not show any 
significant activity thresholded at p < 0.001 uncorrected. The main effect 
of language group (Japanese vs. English) is shown in (H); red 
corresponds to activity thresholded at p < 0.001 for Japanese > English 
and blue corresponds to activity for English > Japanese. 

To ensure that the differential 
brain activity reported in the analyses of this study was not just
the result of extraneous neural processes involved with (or resulting
from) behavioral performance, task difficulty (e.g., attention,
working memory, concentration and/or response confidence), 
and/or variables arising from the quasi-experimental design, the 
same analyses were conducted using phonetic identification per- 
formance as a covariate of non-interest. The results of the contrast 
Eng(accented) - Eng(unaccented) using phonetic identification 
performance as a covariate of non-interest showed activity in left
PMvi/Broca's area, left Broca's BA 45, pre-SMA, right cerebellum
lobule VI, and left cerebellum lobule VII (see Table 3). The ROI
analysis using phonetic identification performance as a covariate 
of non-interest revealed significant activity in left and right cere- 
bellum lobule VI, and a trend toward significant activity in left 
PMvs (p = 0.057) (see Table 4). No significant activity was found
for Jpn subjects using a threshold of p < 0.001 uncorrected or for 
the ROI analyses (see Figure 2F and Tables 3, 4). 

In order to determine brain activity that was related to diffi- 
cult perceptual identification of a native phonetic contrast, the 
foreign-accented condition (which was difficult to perceive for 
both the native English speakers and the native Japanese speakers) 



was compared to the unaccented condition (which was easy to 
perceive for the native English speakers, but more difficult to 
perceive for the native Japanese speakers) between the Eng vs. 
the Jpn group using the contrast Eng(accented - unaccented) - 
Jpn(accented - unaccented) (random effects two-sample t-test). 
Only the pre-SMA activity was significant at p < 0.05 FDR corrected;
therefore, the analysis was conducted using a threshold
of p < 0.001 uncorrected. Brain regions that showed significant
differential activity for this contrast included the left and right
Broca's area BA45, the pre-SMA, the right dorsolateral prefrontal
cortex (DLPFC), the cerebellum lobule VIIa, and the brain stem
(see Figure 3 and Table 3). The same analysis using phonetic
identification performance as a covariate of non-interest revealed 
activity only in left Broca's area using a threshold of p < 0.0015. 
The results of the ROI analysis using small volume correction 
for multiple comparisons revealed significant activity in the left 
PMvi, and the right cerebellum lobule VI (see Figure 4 and 
Table 4). When using performance as a covariate of non-interest, 
no significant differential activity was found when correcting for 
multiple comparisons within the ROIs (Table 4). 
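
The logic of the interaction contrast Eng(accented - unaccented) - Jpn(accented - unaccented) can be sketched for a single voxel as follows. This is a minimal illustration, assuming one contrast estimate per subject and condition and an equal-variance (pooled) two-sample t statistic; the function name and all numbers are hypothetical, and the analysis software may handle variance differently.

```python
from math import sqrt
from statistics import mean, variance


def interaction_t(eng_acc, eng_unacc, jpn_acc, jpn_unacc):
    """Two-sample t statistic for Eng(acc - unacc) - Jpn(acc - unacc).

    Each argument holds one value per subject (hypothetical contrast
    estimates at a single voxel). Returns (t, degrees_of_freedom).
    """
    # Within-subject difference: accented minus unaccented.
    eng_diff = [a - u for a, u in zip(eng_acc, eng_unacc)]
    jpn_diff = [a - u for a, u in zip(jpn_acc, jpn_unacc)]
    n1, n2 = len(eng_diff), len(jpn_diff)
    df = n1 + n2 - 2
    # Pooled variance across the two groups (equal-variance assumption).
    pooled = ((n1 - 1) * variance(eng_diff)
              + (n2 - 1) * variance(jpn_diff)) / df
    se = sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(eng_diff) - mean(jpn_diff)) / se, df


# Hypothetical per-subject values, for illustration only.
t_value, dof = interaction_t(
    eng_acc=[2.0, 3.0, 4.0], eng_unacc=[1.0, 1.0, 1.0],
    jpn_acc=[1.0, 2.0, 3.0], jpn_unacc=[1.0, 1.0, 1.0],
)
print(t_value, dof)
```

Because each subject contributes a difference score, effects that are common to both conditions within a subject cancel before the groups are compared.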

Brain activity related to processing of foreign-accented pro- 
ductions of a second language phonetic category that was dif- 
ferent from processing of foreign-accented productions of a first 
Table 1 | Conjunction of all conditions Eng Unaccented, Jpn
Unaccented, Eng Accented, Jpn Accented (Figure 2G).

Brain region                          MNI coordinates (x,y,z)
PMvi, Broca's area, BA 6,44           -54,12,27
                                      51,6,21
PMvs/PMd BA 6                         -51,6,39
                                      54,9,39
                                      39,-12,51
PostCG, IPL BA 1,2                    54,-30,51
                                      -45,-30,39
Medial Frontal Cortex BA 9 Pre-SMA    -6,12,57
SPL BA 7                              -27,-57,45
Insula BA 13                          -36,-33,24
MTG/STG BA 21,22                      -63,-27,-3
                                      66,-27,-3
Cerebellum Vermis                     0,-78,-18
Cerebellum Lobule VI                  -18,-54,-24
                                      27,-66,-27

Table showing clusters of activity for the conjunction of all contrasts relative
to rest (Eng Accented, Eng Unaccented, Jpn Accented, Jpn Unaccented)
thresholded at pFDR < 0.05 corrected with an extent threshold greater than
5 voxels. Jpn., Japanese; Eng., English; Cor., corrected for multiple
comparisons; BA, Brodmann area; PMvi, Ventral inferior premotor cortex; PMvs,
Ventral superior premotor cortex; PMd, Dorsal premotor cortex; PostCG,
Postcentral gyrus; IPL, Inferior parietal lobule; pre-SMA, Pre-supplementary
motor area; SPL, Superior parietal lobule; MTG, Middle temporal gyrus; STG,
Superior temporal gyrus. Negative x MNI coordinates denote left hemisphere
and positive x values denote right hemisphere activity.

Table 2 | Main contrast of language group.

language phonetic category was investigated using the contrast 
Jpn(accented) - Eng(accented). No significant activity was found 
using a corrected threshold of pFDR < 0.05; therefore, a threshold
of p < 0.001 uncorrected was used. Activity was present in
the right PMvi/Broca's area and the right PMvs/PMd
(see Figure 5A and Table 3). Using phonetic identification
performance as a covariate of non-interest revealed activity in right
PMvi/Broca's area, the right PMvs/PMd, and the left cerebellar
lobule VI. For the ROI analysis, activity was significant in the right
PMvi/Broca's area, right PMvs/PMd, and the left cerebellar lobule 
VI (see Figure 6 and Table 4). Using performance as a covariate 
of non-interest, the ROI analysis showed significant activity in left 
cerebellar lobule VI, and a trend toward significance in both right 
PMvi/Broca's area (p = 0.074) and right PMvs/PMd (p = 0.063)
(see Table 4). 

To determine activity related to processing of unaccented 
productions of a second language phonetic category that was dif- 
ferent from that of unaccented productions of a first language 
phonetic category, the difference between the Jpn and Eng sub- 
jects for unaccented speech was investigated using the contrast 
Jpn(unaccented) - Eng(unaccented). No significant activity was 
found using a corrected threshold of pFDR < 0.05; therefore, a
threshold of p < 0.001 uncorrected was used. Activity was present
in left and right PMvi/Broca's area, right PMvs/PMd, right Broca's
BA45, left IFG BA47, left PostCG, left IPL, and left cerebellar



Brain region                    Jpn - Eng             Eng - Jpn
                                Accented +            Accented +
                                Unaccented            Unaccented
                                Figure 2H (red)       Figure 2H (blue)
PMvi, Broca's area, BA 6,44     -45,0,8
                                48,12,9
PMvs/PMd BA 6                   -30,0,36
                                30,0,39
                                39,-15,60
PostCG, IPL BA 1,2              -60,-18,21
                                51,-24,60
PostCG, IPL BA 3                                      -30,-24,48
Superior medial gyrus BA 10                           -9,54,0
Medial frontal gyrus/SFG BA 9                         -30,30,24
                                                      -15,51,39
                                                      18,33,33
Middle frontal gyrus BA 11                            -30,34,-19
Anterior cingulate gyrus                              9,51,15
Middle cingulate cortex                               -12,-39,42
BA 24,31                                              12,-33,45
                                                      12,-3,45
IPL BA 40                       -45,-39,39
SPL BA 7
Insula BA 13, 47                36,18,-9              -36,-18,15
MTG/STG BA 21,22                -51,-48,6
Angular gyrus BA 39                                   -54,-66,24
MOG BA 18,19                    -27,-69,30            -36,-87,27
Cuneus/Precuneus                                      -9,-72,24
                                                      24,-63,18
Lingual gyrus BA 18                                   -9,-60,0
Cerebellum Lobule VIIa Crus I   -27,-69,-30
                                [illegible]
                                21,-66,-36
Cerebellum Lobule V             -6,-57,-30
Cerebellum Lobule VI            -15,-72,-27
Putamen                         30,9,0

Table showing clusters of activity for the main effect of language group
thresholded at pFDR < 0.05 corrected with an extent threshold greater than 5 voxels.
Jpn., Japanese; Eng., English; Cor., corrected for multiple comparisons; BA,
Brodmann area; PMvi, Ventral inferior premotor cortex; PMvs, Ventral superior
premotor cortex; PMd, Dorsal premotor cortex; PostCG, Postcentral gyrus;
SFG, Superior frontal gyrus; IPL, Inferior parietal lobule; SPL, Superior parietal
lobule; MTG, Middle temporal gyrus; STG, Superior temporal gyrus; MOG,
Middle occipital gyrus. Negative x MNI coordinates denote left hemisphere
and positive x values denote right hemisphere activity.

lobules VIIa and V, as well as left and right cerebellar lobule VI
(see Figure 5B and Table 3). Using phonetic identification perfor- 
mance as a covariate of non-interest, the analysis revealed activity 
primarily in right PMvi/Broca's area, right PMvs/PMd, and left 
cerebellum lobule VI. The results of the ROI analysis using small 
volume correction for multiple comparisons revealed significant 
Table 3 | MNI Coordinates of Clusters of Activity for Contrasts of Interest.

Columns, left to right:
Brain region
Accented - Unaccented /rl/ Identification (Eng - Jpn), Figure 3
Accented - Unaccented /rl/ Identification (Eng), Figure 2E
Accented /rl/ Identification (Jpn - Eng), Figure 5A
Unaccented /rl/ Identification (Jpn - Eng), Figure 5B

PMvi, Broca's area, -48,10,4 (-48,12,-3) (-45,0,9) (48,3,0) 48,9,15,
BA 6,44 48,12,6 (48,12,6) 60,15,3
PMvs/PMd BA 6 33,-15,48 27,0,36 (27,0,36) 30,0,42 (30,0,42)
(57,0,45) (39,-15,66)
Broca's Area BA 45 -51,30,6 (-54,30,3*) -42,33,6 (-48,30,9) 54,21,9
54,27,18
IFG BA 47 -45,24,-12 (-45,24,-12) -30,21,-4
Rolandic operculum -63,-18,21 (57,-12,12)
BA 43
MFG BA 8 (-51,15,42)
MFC including 0,39,33**, 0,36,42 0,32,38, 0,29,50 (3,33,45)
Pre-SMA
SMA (-15,-6,66)
DLPFC 54,30,30
MTG BA 21 (69,-18,-6)
IPL BA 40 -45,-39,39,
-30,-48,39
SPL (-12,-51,66)
Cerebellum Lobule V -15,-57,-30
Cerebellum Lobule VI 27,-60,-33 (27,-60,-33) (-15,-57,-27) 21,-66,-36
(-18,-57,-30)
Cerebellum Lobule VII 6,-81,-33 (-3,-69,-30) -27,-69,-30
-9,-87,-27
18,-72,-39
Brain Stem 0,-30,-30 (6,-45,-36)

Table showing clusters of activity for the various contrasts thresholded at p < 0.001 uncorrected. Coordinates in parentheses denote those that are significant
when using phonetic identification performance as a covariate of non-interest. Jpn., Japanese; Eng., English; BA, Brodmann area; PMvi, Ventral inferior premotor
cortex; PMvs, Ventral superior premotor cortex; PMd, Dorsal premotor cortex; IFG, Inferior frontal gyrus; MFG, Middle frontal gyrus; MFC, Medial frontal cortex;
SMA, Supplementary motor area; DLPFC, Dorsolateral prefrontal cortex; MTG, Middle temporal gyrus; IPL, Inferior parietal lobule; SPL, Superior parietal lobule.
Negative x MNI coordinates denote left hemisphere and positive x values denote right hemisphere activity. *Cluster was not significant when thresholded at p <
0.001 uncorrected but was significant at p < 0.0015 uncorrected. **Significant at p < 0.05 FWE correcting for multiple comparisons across the entire volume.



activity in left and right PMvi/Broca's, right PMvs/PMd and 
left and right cerebellum lobule VI (see Figure 7 and Table 4). 
These same brain regions were shown to have significant activa- 
tion (correcting for multiple comparisons) when using phonetic 
identification performance as a covariate of non-interest. 

DISCUSSION 

The goal of this study was to determine if there are differences in 
the level and/or patterns of activation for various brain regions 
involved with the processing of accented speech when distinct 
phonetic categories existed within a listener's language networks 
(first-language), relative to when listeners do not have well-established
phonetic categories (second-language) (i.e., English /r/
and /l/ identification for native Jpn speakers). The conjunction
analysis of all four conditions [Eng(accented), Eng(unaccented),
Jpn(accented), Jpn(unaccented)] revealed that the same brain
regions (STG/S, MTG, SMG, Broca's area, PMC, medial frontal
cortex (MFC)/pre-supplementary motor area, and the cerebellum
lobule VI) were active (see Figures 2A-D,G and Table 1). These
results suggest that, to a large extent, it is the level of activity 
within these common regions that differs between conditions, 
rather than recruitment of different regions in the brain. It should 



be noted that, even for the Eng unaccented condition, there was 
common activation in speech motor regions. 

Increased brain activity during the presentation of accented 
first-language phonetic categories relative to unaccented phonetic 
categories [Eng(accented - unaccented)] was located primarily in 
the left and right cerebellum, as well as in left PMvi/Broca's area, 
and right PMvs/PMd (see Figure 2E, Tables 3, 4). These results 
were also found when using phonetic identification performance 
as a covariate of non-interest. When general stimulus and subject 
variables were controlled for, using the contrast of Eng(accented - 
unaccented) - Jpn(accented - unaccented), the brain regions 
with significant activation included the pre-SMA, the right cere- 
bellum, left Broca's area BA45, and the left PMvi/Broca's area (see 
Figures 3, 4, Tables 3, 4). However, when using performance as a 
covariate of non-interest, only left Broca's area BA45 showed sig- 
nificant activity (see Tables 3, 4). Broca's area BA45 is thought to 
provide a contextual supporting role to the mirror neuron system 
(Arbib, 2010). PMvi/Broca's area and the cerebellum are hypothesized
to be regions that instantiate the articulatory-auditory
models that are involved with both speech production and
perception (Callan et al., 2004a; Tourville and Guenther, 2011; 
Guenther and Vladusich, 2012). The left hemisphere activity 
Table 4 | ROI analysis using small volume correction for contrasts of interest.

Contrasts (pCor. and peak x,y,z are given for each):
(1) Accented - Unaccented /rl/ identification (Eng - Jpn), Figure 4
(2) Accented - Unaccented /rl/ identification (Eng), Figure 2E
(3) Accented /rl/ identification (Jpn - Eng), Figure 6
(4) Unaccented /rl/ identification (Jpn - Eng), Figure 7

Brain region          SVC center    (1)                  (2)                  (3)                  (4)
                      coordinate    pCor.  x,y,z         pCor.  x,y,z         pCor.  x,y,z         pCor.  x,y,z
                      (8 mm radius)
PMvi, Broca's BA6,44  -51,9,21      0.030  -48,12,21     0.092  -57,6,21      n.s.   -             0.042  -54,9,27
                      51,15,18      n.s.   -             n.s.   -             0.045  48,9,15       0.006  48,9,15
  CovPerf             -51,9,21      n.s.   -             n.s.   -             n.s.   -             n.s.   -
  CovPerf             51,15,18      n.s.   -             n.s.   -             0.074  48,12,12      n.s.   -
PMvs                  -36,-3,57     n.s.   -             0.081  -36,0,51      n.s.   -             n.s.   -
                      27,-3,51      n.s.   -             n.s.   -             0.036  27,0,45       0.006  27,0,45
  CovPerf             -36,-3,57     n.s.   -             0.057  -36,0,51      n.s.   -             n.s.   -
  CovPerf             27,-3,51      n.s.   -                                  0.063  27,0,45       0.027  21,-6,51
STG/S                 -57,-39,9     n.s.   -             0.075  -57,-36,3     n.s.   -             n.s.   -
  CovPerf             -57,-39,9     n.s.   -             0.091  -57,-36,3     n.s.   -             n.s.   -
Cerebellum Lobule VI  -27,-63,-39   n.s.   -             0.011  -30,-57,-36   0.042  -27,-66,-33   0.011  -27,-66,-33
                      30,-66,-33    0.034  36,-72,-33    0.025  27,-60,-33    n.s.   -             0.028  24,-69,-33
  CovPerf             -27,-63,-39   n.s.   -             0.012  -30,-57,-36   0.039  -27,-66,-33   0.005  -21,-60,-39
  CovPerf             30,-66,-33    n.s.   -             0.024  27,-60,-33    n.s.   -             0.033  33,-63,-39

Table showing results of small volume correction analysis (p < 0.05) for multiple comparisons for selected contrasts within regions of interest using MNI coordinates
specified in Callan et al. (2004a) as the seed voxels. The first set of results is for the original analysis. The second set of results, under the heading of CovPerf,
is for the analysis in which phonetic identification performance is used as a covariate of non-interest. SVC, Small volume correction; ROI, Region of interest; BA,
Brodmann area; PMvi, Premotor cortex ventral inferior; PMvs, Premotor cortex ventral superior; n.s., Not significant at p < 0.05 corrected; pCor., p corrected for
multiple comparisons within the SVC small volume corrected region of interest. Negative x MNI coordinates denote left hemisphere and positive x values denote
right hemisphere activity.
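
The geometry of a spherical small-volume ROI such as those in Table 4 can be sketched as follows. The 8 mm radius and the seed coordinate come from the table; the 3 mm isotropic voxel grid is an assumption made here for illustration (the reported coordinates are multiples of 3), and the function names are hypothetical rather than part of any analysis package.

```python
def sphere_voxels(center, radius_mm=8.0, voxel_mm=3.0):
    """List voxel centers (MNI, mm) within radius_mm of a seed coordinate.

    The seed is assumed to lie on the voxel grid; voxel_mm is an assumed
    isotropic voxel size.
    """
    cx, cy, cz = center
    reach = int(radius_mm // voxel_mm)  # farthest grid step worth checking
    voxels = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                dx, dy, dz = i * voxel_mm, j * voxel_mm, k * voxel_mm
                if dx * dx + dy * dy + dz * dz <= radius_mm ** 2:
                    voxels.append((cx + dx, cy + dy, cz + dz))
    return voxels


def within_roi(peak, center, radius_mm=8.0):
    """True if a peak coordinate lies within radius_mm of the seed."""
    return sum((p - c) ** 2 for p, c in zip(peak, center)) <= radius_mm ** 2


# Left PMvi/Broca's seed from Table 4 and the peak reported for contrast (1).
roi = sphere_voxels((-51, 9, 21))
print(len(roi), within_roi((-48, 12, 21), (-51, 9, 21)))
```

Correcting for multiple comparisons only across the voxels in such a sphere, rather than across the whole brain, is what makes the small volume correction less conservative than a whole-volume correction.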




FIGURE 3 | Significant brain activity (thresholded at p < 0.001
uncorrected) for the interaction of language group and accent.
This contrast focused on the activity involved with perception of
foreign-accented productions of a first-language phonetic category. (A)
Significant brain activity rendered on the surface of the brain for the
contrast of Eng(accented-unaccented) - Jpn(accented-unaccented)
showing activity in pre-supplementary motor area (pre-SMA), left and
right Broca's area BA45, right dorsolateral prefrontal cortex (DLPFC),
and left and right cerebellum. (B-E) Contrast estimates and
standard error of the SPM analysis relative to rest for the four
conditions in selected regions: (B) Pre-SMA, (C) PMvi, (D) Broca's,
(E) Cerebellum.
FIGURE 4 | Region of interest (ROI) analysis for the contrast of
Eng(accented-unaccented) - Jpn(accented-unaccented) using small
volume correction (p < 0.05) for multiple comparisons. (A) Left
PMvi/Broca's area. (B) Right Cerebellum Lobule VI. MNI X, Y, Z coordinates
are given at the top of each brain slice. Negative x MNI coordinates denote
left hemisphere and positive x values denote right hemisphere activity. The
SPM contrast estimates and standard error relative to rest for all four
conditions are given on the left side of each ROI rendered image.



FIGURE 5 | (A) Contrast investigating specific brain regions involved with
the perception of foreign-accented productions of a second-language
phonetic category, Jpn(accented) - Eng(accented). Activity is present in the
right ventral inferior premotor cortex including Broca's area (PMvi/Broca's)
and the right ventral superior premotor cortex (PMvs). (B) Activity for
perception of a second-language phonetic category that may not be specific
to foreign-accented productions, Jpn(unaccented) - Eng(unaccented), was
found in the left and right PMvi/Broca's, the right PMvs/PMd, the right
Broca's area BA45, the left inferior frontal gyrus BA47, the left postcentral
gyrus, the left inferior parietal lobule, and the left and right cerebellum.



observed in Broca's area BA 45 and PMvi/Broca's area is consistent
with other studies that showed only left hemisphere
activity for speech perception tasks that required phonetic
processing (Demonet et al., 1992; Price et al., 1996). The presence
of increased activity in speech motor regions observed in this
study, and the lack of significant differential activity in the
STG/S, are consistent with the hypothesis that neural processes
involved with auditory-articulatory mappings are used to
facilitate the perception of foreign-accented productions of one's
first language. However, the absence of differential activity in 
auditory regions for this contrast does not indicate that auditory 
processes are not important for intelligibility and perceptual 
categorization. 

The activity present in the MFC that included the pre-SMA 
for all conditions (see Figure 2 and Table 1) is interesting given 
that several studies suggest that this region may be involved 
with value- and context-dependent selection of actions (Deiber
et al., 1999; Lau et al., 2004; Rushworth et al., 2004). Activity
found in the MFC/pre-SMA in this study may represent value-
and context-dependent selection of internal models. It is important
to note that the contrast Eng (accented) vs. Jpn (accented)
showed greater activity in the MFC (see Figures 3, 4, Tables 3, 4).
This was also true when phonetic identification performance was
used as a covariate of non-interest. This suggests greater use of
value-dependent context for selection when internal models are
well established (as is thought to be the case for /r/ and /l/ for
native English speakers). This region also displayed significant
activation when the Eng vs. Jpn groups were compared (see
Figure 2H, Table 2). The greater extent of activity in these regions 
compared to the Callan et al. (2004a) study may be explained 
by the larger number of speakers used for the stimuli in this 
study, which could have resulted in considerably more context 
variability. 

Brain activity specific to the perception of foreign-accented
productions of phonetic categories from one's second language,
when controlling for task difficulty [Jpn(accented) -
Eng(accented)], was localized in right PMvi/Broca's area, right
FIGURE 6 | Region of interest analysis for the contrast of
Jpn(accented) - Eng(accented) using small volume correction (p < 0.05)
for multiple comparisons. (A) Right PMvi/Broca's area (48,9,15). (B) Right
PMvs/PMd (27,0,45). (C) Left Cerebellum Lobule VI. MNI X, Y, Z coordinates
are given at the top of each brain slice. Negative x MNI coordinates denote left
hemisphere and positive x values denote right hemisphere activity. The SPM
contrast estimates and standard error relative to rest for all four conditions
are given on the left side of each ROI rendered image.



PMvs/PMd, and the left cerebellum (see Figures 5A, 6, and 
Tables 3, 4). These results are also true when using phonetic 
identification performance as a covariate of non-interest. Task 
difficulty was controlled for by presenting foreign-accented speech
(English /r/ vs. /l/ phonetic contrast) that was difficult for both native
English and native Japanese speakers to correctly identify. It
is important to point out that behavioral performance during
the fMRI experiment revealed no significant difference between
native English and native Japanese speakers for the foreign-accented
stimuli, which suggests similar levels of task difficulty
for both groups. 

The contrast Jpn(unaccented) - Eng(unaccented) revealed 
activity in right PMvi/Broca's, right PMvs/PMd, and the right and
left cerebellum (see Figures 5B, 7 and Tables 3, 4). Activity in
these regions was also present when using phonetic identification 
as a covariate of non-interest. The presence of activity in right 
PMvs/PMd for the Jpn(accented) - Eng(accented) contrast and 
the Jpn(unaccented) - Eng(unaccented) contrast suggests that the 
results found are not specific to acoustic properties inherent in 
accented speech. It should be noted that no significant activity 
was found in the STG/S, which is thought to be involved with 
auditory-based speech processing. 



It should be acknowledged that the difference in the number
of men and women in the Eng and the Jpn groups may be
responsible for the between-group differences reported here.
However, the Eng(accented - unaccented) - Jpn(accented -
unaccented) contrast should control for such subject differences. As well,
we believe that it is unlikely that gender differences between the 
groups contributed to our results, given that Callan et al. (2004a) 
did not find gender differences using a very similar task. In addi- 
tion, no gender differences were found in another study that 
employed speech production tasks (Buckner et al., 1995). 

It has been previously suggested that activity in speech motor 
regions (PMC and Broca's area) may not be involved with 
speech intelligibility, but rather reflect differences in cognitive 
processes related to task difficulty, such as attention and work- 
ing memory (Hickok and Poeppel, 2007; Poeppel et al., 2008; 
Lotto et al., 2009; Scott et al., 2009). While all four of the
primary contrasts investigated in this study controlled for gen- 
eral processes related to the phonetic categorization task, only 
the contrast Jpn(accented) - Eng(accented) adequately con- 
trolled for task difficulty. The other two primary contrasts of 
interest [Jpn(unaccented) - Eng(unaccented) and Eng(accented- 
unaccented) - Jpn(accented-unaccented)] did not. 
FIGURE 7 | Region of interest analysis for the contrast of
Jpn(unaccented) - Eng(unaccented) using small volume correction
(p < 0.05) for multiple comparisons. (A) Left PMvi/Broca's area. (B)
Right PMvi/Broca's area. (C) Right PMvs/PMd. (D) Left Cerebellum
Lobule VI. (E) Right Cerebellum Lobule VI. MNI X, Y, Z coordinates
are given at the top of each brain slice. Negative x MNI coordinates
denote left hemisphere and positive x values denote right hemisphere
activity. The SPM contrast estimates and standard error relative to rest
for all four conditions are given on the left side of each ROI rendered
image.



Pertinent to the issue of controlling for extraneous brain 
activity related to aspects of task difficulty, the four primary 
contrasts in this study were analyzed using phonetic identifica- 
tion performance as a covariate of non-interest. The results (see 



Tables 3, 4) showed that many of the same regions (including 
the PMC, Broca's area, and the cerebellum) were still found to 
be differentially active when performance was used as a covariate 
of non-interest. One drawback of using phonetic identification 
performance as a covariate of non-interest to control for task difficulty
is that brain activity related to the processes of enhancing
speech perception is likely removed by the analysis. 
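
The drawback noted above can be illustrated with a minimal sketch of what a covariate of non-interest does to a signal: any component of the activity that varies linearly with performance is removed, whether it reflects task difficulty or processes that enhance perception. The simple one-covariate least-squares fit, the function name, and all values below are hypothetical and stand in for the full analysis model.

```python
from statistics import mean


def residualize(activity, performance):
    """Remove the part of activity that is linearly predicted by performance.

    Fits activity = intercept + slope * performance by least squares and
    returns the residuals (one value per subject).
    """
    mx, my = mean(performance), mean(activity)
    sxx = sum((x - mx) ** 2 for x in performance)
    sxy = sum((x - mx) * (y - my) for x, y in zip(performance, activity))
    slope = sxy / sxx
    return [(y - my) - slope * (x - mx)
            for x, y in zip(performance, activity)]


# Hypothetical proportion-correct scores for four subjects.
performance = [0.60, 0.70, 0.80, 0.90]

# Hypothetical activity that tracks performance exactly.
tracking = [1.0 + 2.0 * p for p in performance]

# After adjustment nothing of this signal is left.
print(residualize(tracking, performance))
```

A region whose activity helps listeners perform better would behave like the hypothetical `tracking` signal, so its contribution would largely disappear from the adjusted contrast.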

Of particular interest is the finding that while the percep- 
tion of foreign-accented productions of a first language is related 
to increased activity in left PMvi/Broca's area and the right 
cerebellum, brain regions involved in the perception of foreign- 
accented productions of a second language differentially acti- 
vate right PMvs/PMd and the left cerebellum instead. While 
left PMvi/Broca's area is thought to be involved with articu- 
latory and sensory aspects of phonetic processing (Guenther 
and Vladusich, 2012), the right premotor cortex is thought to 
be involved with articulatory-to-auditory mapping for feedback 
control (Tourville and Guenther, 2011). These results are con- 
sistent with the hypothesis that the establishment of non-native 
phonetic categories (when the second-language is acquired after 
childhood) involves greater reliance on general articulatory-to- 
auditory feedback control systems. These systems are thought 
to be instantiated in right hemisphere PMC, and generate audi- 
tory predictions based on articulatory planning (Tourville and 
Guenther, 2011; Guenther and Vladusich, 2012). 

Selective activity in right PMC and the left cerebellum (cere- 
bellar cortical anatomical connectivity is predominantly crossed) 
is consistent with the hypothesis that internal models in the 
non-dominant hemisphere are utilized more extensively under 
conditions in which there is interference between established 
categorical representations and new representations during pro- 
cessing. Some additional evidence consistent with this hypothesis 
comes from studies in which non-native speech training led to 
enhanced activity in right PMC and Broca's area (Callan et al., 
2003b; Wang et al., 2003; Golestani and Zatorre, 2004) and the left
cerebellum (Callan et al., 2003b). Also consistent are the results
of some studies investigating second-language processing that
showed greater differential activity for second-language processing
than for first-language processing in right PMC and Broca's
area (Dehaene et al., 1997; Pillai et al., 2003) and the left
cerebellum (Pillai et al., 2004). However, there are several studies
that do not show any difference in brain activity between first-
and second-language processing (Klein et al., 1995; Chee et al.,
1999; Illes et al., 1999). It is important to note that even though
the results of this study support the hypothesis that right Broca's 
area and the left cerebellum are differentially involved in the pro- 
cessing of foreign-accented productions of a second language, 
left Broca's area and the right cerebellum are involved with gen- 
eral processing of foreign-accented phonemes for both first- and 
second-language listeners (see Tables 3, 4). Although it is thought 
that the activity in the left cerebellum and right Broca's area rep- 
resents articulatory-auditory internal models, it is possible that 
the activity represents articulatory-orosensory internal models or 
both articulatory-auditory and articulatory-orosensory internal 
models. Further experiments are needed to discern the types of 
internal models used under differing conditions. 

The activation in left and right cerebellar lobule VI was within 
the region known to be involved with lip and tongue represen- 
tation (Grodd et al., 2001). Given the predominantly crossed 
anatomical connectivity between the cerebellum and cortical 
areas, the finding of left PMC and right cerebellar activity
is consistent with the use of internal models for processing
first-language phonemes. In contrast, the right PMC and left cere- 
bellar activity that was found is consistent with the use of internal 
models used differentially for perception of foreign-accented pro- 
ductions of a second language. These results are consistent with 
crossed patterns of functional connectivity from the cerebellum 
to Broca's area that have been associated with tool use (Tamada 
et al., 1999). This region of the cerebellum has also been iden- 
tified to be involved with speech perception and production in 
other studies (Ackermann et al., 2004; Callan et al., 2004a). 

The finding of cerebellar activity involved in the perception 
of foreign-accented speech is consistent with a recent study that 
showed greater activity in the cerebellum after adaptation to 
acoustically distorted speech (Guediche et al., 2014). In contrast 
to our hypotheses concerning the use of forward and inverse 
(articulatory-auditory) internal models, Guediche et al. (2014) 
concluded that the cerebellum utilizes supervised learning mech- 
anisms that rely purely on sensory prediction error signals for 
speech perception. 

Another potential explanation of the results differentiating 
between processing of foreign-accented speech between first- and 
second-language speakers could be that there is recruitment of 
extra neural resources when undertaking tasks for which we 
are not trained. It has been shown, for example, that experienced
singers, for whom much of the processing is automated,
show reduced activity relative to non-experienced singers (Wilson 
et al., 2011). It is unlikely that the results of our study can be 
explained by differences in task training and expertise, as the 
foreign-accented speech was difficult for both the English and 
Japanese groups, and the subjects had the same amount of training
on the phonetic categorization task. In addition, there was no
significant difference in behavioral performance between the two 
groups (see Figure 1). However, it may be the case that very different
processes are recruited when distinct phonetic categories
exist (first-language perception) vs. when they do not (second-language
perception). Although our results are consistent with the
hypothesis that the establishment of second-language phonetic
categories involves general articulatory-to-auditory feedback control
systems in right hemisphere PMC, which generate auditory
predictions based on articulatory planning, it cannot be ruled out
that the pattern of differential activity reflects meta-cognitive processing
strategies that result from the task requirement to identify
phonetic categories that are from either one's first or
second language. The processes may be more automatic for native
speakers (or speakers with well-established phonetic categories)
than for non-native speakers.

CONCLUSION 

The results of this study suggest that perception of foreign- 
accented phonetic categories involves brain regions that support 
aspects of speech motor control. For perception of foreign-accented
productions of a first language, the activation in left
PMvi/Broca's area, right cerebellum lobule VI, and the pre-SMA
is consistent with the hypothesis that internal models instantiating
auditory-articulatory mappings of phonemes are selected
to facilitate perception. Brain regions selective for perception of 
second-language phonetic categories include right PMvi/Broca's, 
right PMvs/PMd, and the left cerebellum, and are consistent with
the hypothesis that articulatory-to-auditory mappings used for
feedback control of speech production are used to facilitate phonetic
identification. The lack of activity in the STG/S for any
of the contrasts under investigation would tend to refute the
hypothesis that strong engagement of bottom-up auditory processing
facilitates speech perception of foreign-accented speech
under these conditions. Brain regions involved with articulatory-auditory
feedback for speech motor control may be a precursor
for development of perceptual categories.